In an age when people spend most of their time indoors and smartphones have
become a necessity, there is an increasing demand for determining a user's
absolute position in indoor environments. While global positioning systems (GPS)
perform well outdoors, their accuracy is unacceptable in places where the
GPS signal is weak or barely detectable. This has led to a number of solutions
that use the smartphone's inertial measurement unit (IMU) to track the user's
location. Most IMU-based methods track the trajectory of a person using
stride-length and heading estimation. Thus, the accuracy of stride-length
estimation plays a very important role in these methods. Inspired by recent
successes in computer vision and machine learning, we propose an
image-based stride-length estimation method that employs the Gramian angular
field (GAF) to convert accelerometer data into images and then feeds them
into a convolutional neural network (CNN) to predict the stride length. We
evaluate the performance of the proposed method on a public dataset
from Qu Wang's GitHub repository (available at
https://github.com/Archeries/StrideLengthEstimation). The results show that
the proposed method is more accurate, both for a single stride and over a
long walking distance, than other methods that use only accelerometer data.
1. INTRODUCTION

Pedestrian dead reckoning (PDR) is a method for estimating a person's location without the support of any
external infrastructure in the environment. The technique uses only inertial measurement unit (IMU) sensors
(namely, the accelerometer, gyroscope, and sometimes magnetometer) that are attached to or carried by the
user. To obtain the user's relative position, three quantities must be extracted: step events, stride length, and
heading. Among these tasks, stride length estimation (SLE) has attracted the attention of many researchers
because the information is valuable not only for positioning but also for activity monitoring and gait
analysis [1].

The simplest SLE method assumes that a person's average stride length can be represented by a
constant. This approach is inaccurate because different people have different stride lengths. Many
studies have addressed SLE with more advanced techniques and models. Díez [2] surveyed the field and
divided the approaches into two classes: direct methods and indirect methods. In this paper, we examine
three main approaches: biomechanical methods, integration methods, and adaptive methods.

Biomechanical methods build gait analysis into their models. Miyazaki [3] used a gyroscope
attached to the subject's thigh to measure the angles formed by the lower limbs during walking.
When evaluating the step length, the author assumed that the length of the subject's legs is
known and that the two steps in the same stride are equal. Potential errors from these assumptions are
corrected by exploiting the relationship between stride length and walking velocity. Zijlstra [4] proposed a
method that calculates the stride from the lower limb length and the change in height of the center of mass
(COM). Another approach was proposed by Weinberg [5], who used vertical acceleration to estimate the
stride length. Kang [6] attempted to improve Weinberg's equation by combining it with another
logarithm-based formula under some added constraints.

The double integration model can be implemented as a strap-down inertial navigation system
(INS). Li and Young [7] used a 2-axis accelerometer and a 1-axis gyroscope placed on the subject's shank to
record movements. The walking motion is then segmented and converted into a world coordinate frame
using the angle calculated from the gyroscope readings. Kose and colleagues [8] used a wavelet-based
decomposition method to detect and separate the steps of each leg, and then applied a Kalman filter and
reverse integration to compute the step length. Error in the method is compensated by removing the pelvic
rotation from the model. Another way to correct sensor error is to reset the integration at zero-velocity
update (ZUPT) points, as in [9]-[12].

The adaptive approach comprises two types of models: parametric and non-parametric.
Kim [13] performed experiments to determine the correlation between the stride length and the mean of
the accelerometer signal over the same stride. The method proposed in [14] focuses on the step
frequency and its linear relationship with the stride length. A similar approach with additional
features can be seen in [15]. Methods that use the variance of the accelerometer signal in their
models can be found in [16]-[18]. Beyond linear models, Zihajehzadeh [19] used Gaussian process
regression (GPR) to achieve better results. Much recent research uses non-parametric models. The
method in [20] applies a neural network whose features are three values computed from the maxima and
minima in each stride. Hannink [21] also used a CNN, with the accelerometer and gyroscope signals
normalized to 256 samples per stride. Gu [22] trained a stacked autoencoder to learn important features
from the input data, which are then fed to a regression layer for stride estimation.

Although much progress has been made in estimating a person's stride length, existing methods still
have limitations. Biomechanical methods require some parameters to be known beforehand, which might not
be available. For double integration, the sensor position plays a major role, so smartphones or other
electronic devices may not be suitable. Finally, for adaptive models, feature selection is crucial because it
has a great influence on model performance. Once the relative position of a user is obtained, it can be
combined with indoor positioning methods to make the absolute position prediction more accurate
[23], [24].

We take a different approach to address these problems and present a new method for
estimating stride length. First, we use only the accelerometer data from the dataset collected by Wang [25].
Second, the method does not require any information about the user's height or leg length. Third, to reduce
the burden of selecting features and determining their relationships, the data is preprocessed and converted
into images using the GAF algorithm [26], which has been successfully applied as a time series encoder in
[27], [28]. Finally, for the learning task, we use a CNN model because of its flexibility and accuracy.


2. RESEARCH METHOD 
2.1. GAF algorithm 

In our research, we focus on the accelerometer because it captures data related to the
user's walking motion. The raw output of an accelerometer can be described as a time series, and its patterns
can be extracted to estimate the subject's stride length. To retain the features of the data, we represent
the information using the GAF algorithm proposed by Wang [26]. Wang's algorithm converts
one-dimensional time series data into a two-dimensional array, which can also be interpreted as an
image. The method is briefly described as follows.

Suppose that our accelerometer data is a time series $X = \{x_1, x_2, \ldots, x_n\}$, where $n$ is the
size of $X$. First, we rescale the data to the range $[-1, 1]$ as in (1):

$$\tilde{x}_i = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)} \qquad (1)$$

where $\tilde{x}_i$ is the normalized value of $x_i$, and $\max(X)$ and $\min(X)$ are the maximum and
minimum values of $X$, respectively. The rescaled data can be expressed in a polar coordinate system using
the following transformation:

$$\begin{cases} \phi_i = \arccos(\tilde{x}_i) \\ r_i = \dfrac{i}{N}, \quad i \in \mathbb{N} \end{cases} \qquad (2)$$

where $\phi_i$ is the angle, $r_i$ is the radius, and $N$ is the number of data points. The arccosine function
maps input values in the range $[-1, 1]$ to angles in $[0, \pi]$.

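This representation gives us another way to gain insight into time-series data. We can calculate the
trigonometric sum or difference between each pair of sampling points to determine the temporal correlation
between them. The Gramian angular summation field (GASF) and the Gramian angular difference field
(GADF) are defined as (3) and (4):

$$\mathrm{GASF} = \begin{pmatrix} \cos(\phi_1 + \phi_1) & \cdots & \cos(\phi_1 + \phi_n) \\ \vdots & \ddots & \vdots \\ \cos(\phi_n + \phi_1) & \cdots & \cos(\phi_n + \phi_n) \end{pmatrix} \qquad (3)$$

$$\mathrm{GADF} = \begin{pmatrix} \sin(\phi_1 - \phi_1) & \cdots & \sin(\phi_1 - \phi_n) \\ \vdots & \ddots & \vdots \\ \sin(\phi_n - \phi_1) & \cdots & \sin(\phi_n - \phi_n) \end{pmatrix} \qquad (4)$$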


We utilize this algorithm to transform sensor data into images and the procedure is illustrated in Figure 1. 
Figure 1. Illustration of GAF algorithm: (a) one-dimensional time series data, (b) polar coordinate
representation, (c) GASF representation
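
The transformation in (1) to (4) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code used in the experiments: the function name `gramian_angular_field` and the synthetic sine-wave stride are assumptions, and the GADF branch follows the usual sine-of-difference definition.

```python
import numpy as np


def gramian_angular_field(x, method="summation"):
    """Encode a 1-D series as a GASF ("summation") or GADF ("difference") matrix."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("a constant series cannot be rescaled to [-1, 1]")

    # Equation (1): rescale the series to [-1, 1].
    x_tilde = ((x - x_max) + (x - x_min)) / (x_max - x_min)
    # Guard against rounding that lands slightly outside the arccos domain.
    x_tilde = np.clip(x_tilde, -1.0, 1.0)

    # Equation (2): angular coordinate of each sample.
    phi = np.arccos(x_tilde)

    if method == "summation":
        return np.cos(phi[:, None] + phi[None, :])  # GASF, equation (3)
    if method == "difference":
        return np.sin(phi[:, None] - phi[None, :])  # GADF, equation (4)
    raise ValueError("method must be 'summation' or 'difference'")


# Synthetic stand-in for one stride (about 120 samples at 100 Hz).
stride = np.sin(np.linspace(0.0, 2.0 * np.pi, 120))
gasf = gramian_angular_field(stride, "summation")
gadf = gramian_angular_field(stride, "difference")
print(gasf.shape, gadf.shape)
```

A stride of n samples yields an n x n matrix, so strides of different duration produce images of different size. The GASF is symmetric, while the GADF is antisymmetric with a zero diagonal.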


2.2. Proposed stride length estimation method 
2.2.1. Overall architecture of the method 

We propose a method for stride length estimation that consists of three phases, as shown in
Figure 2. The first phase is data preprocessing, which handles raw data from the accelerometer through
filtering, segmentation, and conversion of the signal into images. Data preprocessing includes a module
called time series to image conversion. Its task is to rescale the data, represent it in polar coordinates, and
then construct a GASF or GADF matrix. The input to the CNN is normalized by resizing the GASF matrix to
a fixed size (128x128). The second phase is training the CNN model using the images and labels extracted
from the training dataset. Details of the model are described in Section 2.2.3. In the third phase, after
training, we use the model to predict values for the testing dataset.


2.2.2. Data preprocessing 

Raw accelerometer data is subject to noise caused by shaking during the user's motion. To reduce the
noise, we apply a fifth-order Butterworth low-pass filter with a cutoff frequency of 5 Hz.
After the accelerometer readings are filtered, they need to be divided into smaller segments. Most of the time,
this task is performed by a step detector or step counter. To simplify this requirement, we assume that the
data is already divided and that each segment represents one stride, as can be seen in Figure 3. After
segmentation, the filtered data from each axis is converted into an image using the GAF algorithm described
in the previous section. The procedure can be seen in Figure 4.
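
The filtering step can be sketched with SciPy as follows. The cutoff (5 Hz) and order (5) come from the text and the 100 Hz sampling rate from the dataset [25]; the helper name `lowpass_filter`, the synthetic signal, and the use of zero-phase filtering via `filtfilt` are assumptions made for illustration.

```python
import numpy as np
from scipy.signal import butter, filtfilt


def lowpass_filter(signal, cutoff_hz=5.0, fs_hz=100.0, order=5):
    """Apply a Butterworth low-pass filter along the time axis (axis 0)."""
    b, a = butter(order, cutoff_hz, btype="low", fs=fs_hz)
    # filtfilt runs the filter forward and backward (zero phase shift).
    # This is an assumption: the text does not say how the filter is applied.
    return filtfilt(b, a, signal, axis=0)


# Synthetic 3-axis recording: a 2 Hz gait-like component plus 30 Hz jitter.
t = np.arange(400) / 100.0
clean = np.sin(2 * np.pi * 2.0 * t)
noisy = clean + 0.5 * np.sin(2 * np.pi * 30.0 * t)
acc = np.stack([noisy, noisy, noisy], axis=1)  # shape (400, 3)

filtered = lowpass_filter(acc)
print(filtered.shape)
```

Zero-phase filtering keeps the peaks of the signal aligned with the stride boundaries, which matters when the segments are cut after filtering. A causal filter (`scipy.signal.lfilter`) would be the alternative for streaming use.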


Figure 2. Proposed method architecture 
Figure 4. Data to images conversion procedure: (a) filtered data of 3 axes, (b) polar coordinate data of 3
axes, (c) GASFs


The CNN model requires input images of a fixed size. However, stride durations differ, which
leads to images of different sizes. Thus, we have to resize each image to a fixed
dimension. As stated for the dataset [25], the sampling rate is 100 Hz and each stride contains about 120
samples, so we chose an image size of 128x128 for each axis to retain the features inside.
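
As an illustration of the resizing step, the sketch below brings a GAF matrix of arbitrary size to 128x128 using separable linear interpolation in NumPy. The interpolation method is an assumption, since the text does not state which one was used, and `resize_bilinear` is a hypothetical helper name.

```python
import numpy as np


def resize_bilinear(image, size=128):
    """Resize a 2-D array to (size, size) by separable linear interpolation."""
    image = np.asarray(image, dtype=float)
    n_rows, n_cols = image.shape
    # Target sample positions expressed in source-pixel coordinates.
    rows = np.linspace(0.0, n_rows - 1.0, size)
    cols = np.linspace(0.0, n_cols - 1.0, size)

    # Interpolate along the columns of every source row, then along the rows.
    tmp = np.stack([np.interp(cols, np.arange(n_cols), r) for r in image])
    return np.stack([np.interp(rows, np.arange(n_rows), c) for c in tmp.T], axis=1)


# Random stand-in for the GAF matrix of a 117-sample stride.
gaf_like = np.random.default_rng(0).uniform(-1.0, 1.0, size=(117, 117))
resized = resize_bilinear(gaf_like)
print(resized.shape)
```

Because a typical stride has about 120 samples, a 128x128 target changes the image only slightly, so little information is lost or invented by the interpolation.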
2.2.3. CNN architecture 

Most tasks that apply CNNs to images are classification tasks. In our case, however, the
output is the stride length, so the CNN is treated as a regression model. We designed a simple CNN
model that consists of 7 layers. First, a convolutional layer (with ReLU activation) creates the
feature map of the features detected in the input image. Then, to prevent overfitting, we use a dropout layer
(with a rate of 0.3). Before the features are flattened, we normalize them using a BatchNormalization layer.
After that, 2 fully connected layers are used, followed by a single neuron with a linear activation function at
the end of the model. Except for the last neuron, all layers use rectified linear units as their activation
function. The architecture of each layer is shown in Figure 5.
Figure 5. CNN model architecture
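
A Keras sketch of a seven-layer regressor with this layer sequence is given below. Only the layer types, the dropout rate, and the linear output come from the text. The filter count, kernel size, stride, dense-layer widths, the three-channel input (one GAF image per axis), and the name `build_sle_cnn` are assumptions chosen for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_sle_cnn(input_shape=(128, 128, 3)):
    """Small CNN regressor: image stack in, one stride length (metres) out."""
    return keras.Sequential([
        keras.Input(shape=input_shape),
        # Filter count, kernel size and stride are illustrative assumptions.
        layers.Conv2D(8, kernel_size=3, strides=2, activation="relu"),
        layers.Dropout(0.3),  # dropout rate stated in the text
        layers.BatchNormalization(),
        layers.Flatten(),
        # Widths of the two fully connected layers are assumptions.
        layers.Dense(64, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="linear"),  # regression output
    ])


model = build_sle_cnn()
model.summary()
```

The single linear output neuron is what turns the network into a regressor: its output is unbounded, so it can be trained directly against stride lengths in metres.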


3. EXPERIMENTS AND EVALUATION 
3.1. Distance estimation 

To evaluate the performance when the subject travels a long distance, we need to calculate the
accumulated walking distance. The accumulated distance of the subject is computed as (5):

$$D = \sum_{i=1}^{N} \hat{s}_i \qquad (5)$$

where $D$ is the total traveled distance, $\hat{s}_i$ is the estimated length of the $i$-th stride, and $N$ is
the number of strides.


3.2. Error evaluation metrics 
To keep the error metrics consistent with prior evaluations, we adopted the evaluation metrics
from the dataset [25]. The relative stride error is calculated as (6):

$$E_s = \frac{1}{N} \sum_{i=1}^{N} \frac{|s_i - \hat{s}_i|}{s_i} \times 100\ (\%) \qquad (6)$$

where $E_s$ denotes the stride length relative error, and $s_i$ and $\hat{s}_i$ are the actual and the
estimated stride length of the $i$-th stride, respectively. The relative distance error is computed as (7):

$$E_d = \frac{\left| \sum_{i=1}^{N} s_i - \sum_{i=1}^{N} \hat{s}_i \right|}{\sum_{i=1}^{N} s_i} \times 100\ (\%) \qquad (7)$$

where $E_d$ denotes the walking distance relative error, and $s_i$ and $\hat{s}_i$ are the actual and the
estimated stride length of the $i$-th stride, respectively.
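
The accumulated distance (5) and the two error metrics (6) and (7) can be written compactly in NumPy. The function names and the example stride lengths are made up for illustration.

```python
import numpy as np


def total_distance(stride_lengths):
    """Equation (5): accumulated distance as the sum of stride lengths."""
    return float(np.sum(stride_lengths))


def relative_stride_error(actual, estimated):
    """Equation (6): mean per-stride relative error, in percent."""
    actual = np.asarray(actual, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.mean(np.abs(actual - estimated) / actual) * 100.0)


def relative_distance_error(actual, estimated):
    """Equation (7): relative error of the accumulated distance, in percent."""
    d_actual = total_distance(actual)
    d_estimated = total_distance(estimated)
    return abs(d_actual - d_estimated) / d_actual * 100.0


# Made-up stride lengths in metres, for illustration only.
actual = [1.20, 1.30, 1.10, 1.40]
estimated = [1.26, 1.27, 1.15, 1.33]
print(relative_stride_error(actual, estimated))
print(relative_distance_error(actual, estimated))
```

The two metrics answer different questions. In the distance error, overestimated and underestimated strides partly cancel, so it can be much smaller than the per-stride error, as in the example above.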
3.3. Dataset

The dataset we chose for training and evaluation was created and presented by Qu Wang [25]. It
records 10000 strides and their parameters, including readings from the accelerometer,
gyroscope, and magnetometer. To illustrate the dataset, we show the stride-length frequency
distribution of the whole dataset in Figure 6(a).

Figure 6(a) shows that most strides fall in the range from 0.2 meters to 3.5 meters,
which is reasonable because the subjects walk at different velocities. However, there are cases in which
the measured stride length exceeds 3.5 meters and reaches nearly 30 meters. This happens because Wang's
dataset covers several special scenarios, for example users riding escalators or elevators. If we kept
this unusual data in the dataset, it could create a false pattern that would harm the model. Handling such
data properly would require an activity recognition algorithm to distinguish between different motion
patterns and scenarios. However, the dataset does not provide labels for the movement type or the
walking environment, so these special cases cannot be classified. To simplify the problem,
we filter out all strides whose length is outside the [0.2, 3.5] meter range. After filtering, 7998 strides
remain, and their distribution is shown in Figure 6(b).
Figure 6. Dataset before and after filtering: (a) before filtering, (b) after filtering
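
The range filter amounts to a boolean mask over the stride-length labels. The sketch below uses made-up labels, and `filter_strides` is a hypothetical helper; the bounds are taken as inclusive, matching the [0.2, 3.5] notation.

```python
import numpy as np


def filter_strides(stride_lengths, low=0.2, high=3.5):
    """Return a boolean mask selecting strides with length in [low, high] metres."""
    lengths = np.asarray(stride_lengths, dtype=float)
    return (lengths >= low) & (lengths <= high)


# Made-up labels: two walking strides, one near-zero stride, one escalator-like outlier.
lengths = np.array([1.25, 0.05, 1.40, 28.7])
mask = filter_strides(lengths)
print(lengths[mask])
```

The same mask can be applied to the per-stride sensor segments so that signals and labels stay aligned after filtering.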


Next, we use the stride number to segment the dataset into a series of strides. Each stride is
labeled using the stride-length column provided in the dataset. As the accelerometer is our main concern,
only accelerometer signals are used. To provide data for the training and evaluation phases, we
split the data into a training set, a validation set, and an evaluation set. We use 5612 strides for training
and 1403 strides for validation, and the remainder for evaluation.


3.4. Experimental result and analysis 
3.4.1. Model hyperparameters and the performance evaluation 

Our model was built using the Keras library. We use the Huber loss as the loss function because it
is more robust to outliers than other loss functions. For optimization, we tried several optimizers and found
that the Adam optimizer is the best fit for our model. In addition, early stopping was used to prevent
overfitting. The model hyperparameters are summarized in Table 1.


Table 1. Proposed model hyperparameters

Parameter        Value and setting
Loss function    Huber
Optimizer        Adam
Learning rate    0.0001
Metrics          Mean absolute error (MAE)
Batch size       32
Epochs           100
Early stopping   15
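
To illustrate why the Huber loss is less sensitive to outliers, the NumPy sketch below compares it with the squared-error loss on a few residuals. The threshold `delta=1.0` is an assumption (it is the Keras default, and the paper does not state the value used), and the residuals are made up.

```python
import numpy as np


def huber(residual, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    r = np.abs(np.asarray(residual, dtype=float))
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))


residuals = np.array([0.05, 0.5, 3.0, 25.0])  # metres; the last two mimic outliers
print(huber(residuals))
print(0.5 * residuals**2)  # squared-error loss for comparison
```

For small residuals the two losses coincide, but for a large residual the Huber loss grows only linearly, so a single mislabeled stride contributes far less to the gradient than it would under squared error.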


Int J Artif Intell, Vol. 10, No. 4, December 2021: 997 - 1008 


Int J Artif Intell ISSN: 2252-8938


Figure 7 illustrates the mean absolute error (MAE) and the loss during the training process. The error
and loss decrease rapidly in the first few epochs. As training continues, the MAE becomes
stable after 60 epochs, while the loss needs only 20 epochs to reach that state. The model reaches its best
performance after 92 epochs, with training loss, validation loss, training MAE, and validation MAE
equal to 0.00381, 0.00141, 0.0515, and 0.05431, respectively.
Figure 7. MAE and loss of training and validation: (a) MAE, (b) loss


We evaluate the performance of our model on the prepared test set and plot the estimated stride
length against the actual value in Figure 8. Figure 9 shows the results for several concrete strides,
from the raw signals to the intermediate GAF images and the corresponding stride-length predictions. The
figure shows that the predictions of our proposed method are close to the actual values.
Figure 8. Comparison of the stride length estimated by the proposed method and the actual stride length
(with validation data)


3.4.2. Comparison with other models 
From the raw signals, we calculate the magnitude of the accelerations $a_x$, $a_y$, and $a_z$ from the
accelerometer, and then apply the low-pass filter before feeding the signal to the GAF transformation:

$$A = \sqrt{a_x^2 + a_y^2 + a_z^2}$$

For comparison, we implemented 4 models from Kim [13], Yao [16], Shin [17], and Weinberg [5]. These
models can be briefly described as (8), (9), (10), and (11):
(8)
(9) 


(10) 


(11) 


where $A_{max}$ and $A_{min}$ denote the maximum and minimum acceleration values, respectively; $A_i$ is
the acceleration value at the $i$-th stride; $t_{i-1}$ and $t_i$ are the start and end times of the $i$-th
step; $f$ is the stride frequency; and $v$ is the acceleration variance of the step. $K$, $\alpha$, $\beta$,
$c$, and $\gamma$ are model coefficients identified during the training process.
Figure 9. Some strides with their raw signals, image representations, and estimated stride lengths


The Shin, Yao, Weinberg, and Kim methods are evaluated on the testing dataset prepared earlier,
which consists of 973 steps. Figure 10 shows that the estimates of Shin and Yao are scattered
around the actual values, while the Weinberg and Kim methods tend to overestimate. The errors over the
walking distance for our proposed method and the others are detailed in Table 2. Over a distance of
1300.5799 m, the $E_s$ and $E_d$ of our proposed method are only 4.4378% and 3.1756%, the smallest among
all methods. This indicates that our model performs best both in the error of each stride and over a
long distance. Figure 11 shows that, with our proposed method, 80% of strides have an
estimation error under 0.08817 meters, whereas the corresponding values for the Kim, Yao, Shin, and
Weinberg methods are 0.20103, 0.2077, 0.17526, and 0.18179 meters, respectively.
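
The 80% figure is a point on the empirical cumulative distribution function (CDF) of the absolute stride error. A minimal NumPy sketch of how such a value can be read off is shown below; the function names and the synthetic errors are assumptions, not the evaluation code behind Figure 11.

```python
import numpy as np


def error_at_cdf_level(actual, estimated, level=0.80):
    """Absolute-error value below which a fraction `level` of strides fall."""
    errors = np.abs(np.asarray(actual, dtype=float) - np.asarray(estimated, dtype=float))
    return float(np.quantile(errors, level))


def empirical_cdf(errors):
    """Return sorted errors and the cumulative fraction of strides at each one."""
    sorted_errors = np.sort(np.asarray(errors, dtype=float))
    fractions = np.arange(1, sorted_errors.size + 1) / sorted_errors.size
    return sorted_errors, fractions


# Synthetic example: 1000 strides with small random estimation errors (metres).
rng = np.random.default_rng(0)
actual = rng.uniform(0.8, 1.6, size=1000)
estimated = actual + rng.normal(0.0, 0.07, size=1000)
print(error_at_cdf_level(actual, estimated))
```

Plotting the output of `empirical_cdf` for each method on the same axes gives a comparison of the kind shown in Figure 11: a curve that rises faster indicates that more strides have small errors.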
Figure 10. Comparison of estimated stride length and actual stride length: (a) proposed method,
(b) Shin method, (c) Yao method, (d) Weinberg method, (e) Kim method
Table 2. Comparison of proposed method and others

                          Real        Proposed method   Shin        Yao         Weinberg    Kim
Distance (m)              1300.5799   1341.8809         1361.4525   1366.5864   1414.1273   1447.1376
Error (m)                 -           41.301            60.8726     66.0065     113.5474    146.5577
E_d (%)                   -           3.1756            4.6804      5.0751      8.7305      11.2686
Stride length error (m)   -           0.05827           0.11502     0.13454     0.12325     0.15202
E_s (%)                   -           4.4378            8.7553      10.1992     9.3031      11.5474
Figure 11. CDF of proposed method and others


4. DISCUSSION 

From the indoor positioning perspective, accurate estimation of stride length and travel
distance opens up new possibilities, as many tracking systems rely on SLE. By combining our proposed
method with state-of-the-art techniques for step detection and heading estimation, we can minimize the
error during the process and obtain a highly accurate position for the user. Furthermore, this study
could also be used in gait analysis and health monitoring, as the stride length of a person is a
valuable parameter for predicting an impaired gait. The main limitation of our proposed method is that it
depends on heavy computation. Converting the accelerometer data from time series to images and
passing them through the CNN model takes a considerable amount of time, which makes the workload
difficult for the hardware of mobile devices to handle. A better option is to
place the system on a centralized server to harness its processing power and reduce the load on mobile
devices.

Future studies could examine how to reduce the computational time of the proposed method
to support real-time tracking applications. The relationship between stride length and data from other
sensors, such as the gyroscope and magnetometer, could be investigated to further improve accuracy. Finally,
the lack of dataset labels for training should also be addressed, since inaccurate data could lead the model to
learn false patterns. Thus, the procedures for collecting stride-length sensor data need to be rigorously
examined so that the model can distinguish special movement patterns from normal walking.


5. CONCLUSION 

In this paper, we have proposed a new method for stride length estimation. By using the GAF
algorithm, we were able to transform accelerometer time-series data into images. A CNN
model was then designed to estimate the stride length from these images. We trained and evaluated the
model using a public dataset created by Qu Wang. Although this dataset did not satisfy
our requirements for labeling, it gave us an indication of how the model performs. Experiments were
conducted to compare the performance of our model with the Kim, Yao, Shin, and Weinberg models.
The experimental results show that the proposed method outperforms the others. Our model achieved
4.4378% relative stride error and 3.1756% relative distance error, compared with 8.7553% and 4.6804%,
respectively, for the closest competing method (Shin).